RAW SEQUENCE LISTING 
ERROR REPORT 



The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 

Application Serial Number: 

Source: 0 1 ?f . 

Date Processed by STIC: [gWjoj 



THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212. 

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216. 
PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax) 
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax) 

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER 
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND 
TRADEMARK OFFICE WEBSITE. SEE BELOW: 




Checker Version 3.0 

The Checker Version 3.0 application is a state-of-the-art Windows-based software program 
employing a logical and intuitive user interface to check whether a sequence listing is in 
compliance with format and content rules. Checker Version 3.0 works for sequence listings 
generated for the original version of 37 CFR §§1.821 - 1.825 effective October 1, 1990 (old 
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual 
Property Organization (WIPO) Standard ST.25. 

Checker Version 3.0 replaces the previous DOS-based version of Checker and is Y2K-compliant. 
Checker allows public users to check sequence listings in Computer Readable Form (CRF) 
before submitting them to the United States Patent and Trademark Office (USPTO). 
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence 
listings, thus saving time and money. 



Checker Version 3.0 can be downloaded from the USPTO website at the following address: 



http://www.uspto.gov/web/offices/pac/checker 
OIPE 



RAW SEQUENCE LISTING DATE: 12/06/2001 

PATENT APPLICATION: US/09/996,611 TIME: 15:15:15 

Input Set : A:\NEWTEXT.txt 
Output Set: N:\CRF3\12062001\I996611.raw 

SEQUENCE LISTING 

W--> 3 SEQ ID NO: 1 



Does Not Comply 
Corrected Diskette Needed 



file://C:\Crf3\Outhold\VsrI996611.htm 



12/6/01 
VERIFICATION SUMMARY DATE: 12/06/2001 

PATENT APPLICATION: US/09/996,611 TIME: 15:15:16 

Input Set : A:\NEWTEXT.txt 

Output Set: N:\CRF3\12062001\I996611.raw 

L:3 M:244 W: Invalid beginning of sequence listing, Line=[SEQ ID NO: 1], General Header Line 
Not Processed! 
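
The warning above is raised because the file opens with SEQ ID NO: 1 instead of a general header. A minimal sketch of such a check follows. It is not the Checker program's actual logic: it assumes that an acceptable listing begins either with the <110> identifier (new rules) or with a "(1) GENERAL INFORMATION" heading (old rules), and both function names are hypothetical.

```python
def first_content_line(text):
    """Return the first non-blank line that is not the 'SEQUENCE LISTING' title."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and stripped.upper() != "SEQUENCE LISTING":
            return stripped
    return ""


def begins_with_general_header(text):
    """Assumed rule: a listing must open with <110> or (1) GENERAL INFORMATION."""
    first = first_content_line(text)
    return (first.startswith("<110>")
            or first.upper().startswith("(1) GENERAL INFORMATION"))


# The submitted file in this report starts directly with a sequence entry.
submitted = "SEQUENCE LISTING\nSEQ ID NO: 1\n(i) SEQUENCE CHARACTERISTICS:\n"
print(begins_with_general_header(submitted))
```

Under these assumptions the submitted file fails the check, while a file whose first content line is a <110> applicant entry passes.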






SEQUENCE LISTING 
SEQ ID NO: 1 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 954 amino acids 

(B) TYPE: amino acid 

(ii) MOLECULE TYPE: protein 

(iii) FEATURE : 

(A) NAME: Alpha 1 chain collagen 

(B) OTHER INFORMATION: /note="Where 
P=P*=Hydroxyproline" 

MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRTAPTDLVFILDGSYSVGP 50 
ENFEIVKKWLVNITKNFDIGPKFIQVGWQYSDYPVLEIPLGSYDSGEHL 100 
TAAVESILYLGGNTKTGKAIQFALDYLFAKSSRFLTKIAWLTDGKSQDD 150 



PCIPI/EXEC/22/6 
Annex VII, page 29 



<110> Smith, John; Smithgene Inc. 
<120> Example of a Sequence Listing 
<130> 01-00001 

<140> PCT/EP98/00001 
<141> 1998-12-31 

<150> US 08/999,999 
<151> 1997-10-15 

<160> 
<170> PatentIn version 2.0 

<210> 1 
<211> 389 
<212> DNA 
<213> Paramecium sp. 

<220> 
<221> CDS 
<222> (279)...(389) 

<300> 
<301> Doe, Richard 
<302> Isolation and Characterization of a Gene Encoding a 
      Protease from Paramecium sp. 
<303> Journal of Genes 
<304> 1 
<305> 4 
<306> 1-7 
<307> 1988-06-31 
<308> 123456 
<309> 1988-06-31 

<400> 1 
agctgiogic attcctgtgi ccicicctci cigggcuci caccccgcta atcagacctc   60 
agggagagtg tcttgaccct cctctgcctt tgcagcttca caggcaggca ggcaggcagc  120 
tgatgtggca attgctggca gtgccacagg cttctcagcc aggcttaggg cgggtcccgc  180 
cgcggcgcgg cggcccctct cgcgctcctc tcgcgcctct ctctcgctct cctctcgctc  240 


3 






Appendix 3. page 2 



ggacctgatt aggtgagcag gaggaggggg cagttagc 



atg gtt cca atg ttc agc                                      296 
Met Val Ser Met Phe Ser 
1               5 



111 s; x-E.si-K-s; - $ E S5 Kf $ a SE St; 

10 15 

tgt ccc aaa gtc ctc ccc tgt cac tca tca ctg cag ccg aat ctt  389 
Cys Pro Lys Val Leu Pro Cys His Ser Ser Leu Gln Pro Asn Leu 
        25                  30                  35 
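
The pairing of codons with amino acids in the coding region can be checked mechanically. The sketch below is only an illustration: it assumes the standard genetic code and reads the final codon line of the example as tgt ccc aaa gtc ctc ccc tgt cac tca tca ctg cag ccg aat ctt (a cleaned reading of the scanned text). The names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order;
# '*' marks a stop codon.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nucleotides):
    """Translate a lower-case DNA string, ignoring spaces, codon by codon."""
    seq = nucleotides.replace(" ", "")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return "".join(CODON_TABLE[c] for c in codons)


last_line = "tgt ccc aaa gtc ctc ccc tgt cac tca tca ctg cag ccg aat ctt"
print(translate(last_line))  # one-letter codes for residues 23 to 37
```

In one-letter codes the result corresponds to Cys Pro Lys Val Leu Pro Cys His Ser Ser Leu Gln Pro Asn Leu, the last fifteen residues of SEQ ID NO: 2.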



<210> 2 
<211> 37 
<212> PRT 
<213> Paramecium sp. 



Met Val Ser Met Phe Ser Leu Ser Phe Lys Trp Pro Gly Phe Cys Leu 
1               5                   10 

Phe Val Cys Leu Phe Gln Cys Pro Lys Val Leu Pro Cys His Ser Ser 
            20                  25 

Leu Gln Pro Asn Leu 



<210> 3 
<211> 11 
<212> PRT 
<213> Artificial Sequence 

<220> 
<223> Designed peptide based on size and polarity to act as a 
      linker between the alpha and beta chains of Protein XYZ. 

<400> 3 

Met Val Asn Leu Glu Pro Met His Thr Glu Ile 
1               5                   10 



<210> 4 
<400> 4 
000 



[Annex VIII follows] 






cable. The numeric identifier shall be used only in the "Sequence 
Listing." The order and presentation of the items of information in the 
"Sequence Listing" shall conform to the arrangement given below. Each 
item of information shall begin on a new line and shall begin with the 
numeric identifier enclosed in angle brackets as shown. The submission 
of those items of information designated with an "M" is mandatory. The 
submission of those items of information designated with an "O" is 
optional. Numeric identifiers <110> through <170> shall only be set 
forth at the beginning of the "Sequence Listing." The following table 
illustrates the numeric identifiers. 



Numeric     Definition          Comments and Format         Mandatory (M) or
Identifier                                                  Optional (O)

<110>       Applicant           Preferably max. of 10       M
                                names; one name per line;
                                preferable format:
                                Surname, Other Names
                                and/or Initials

<120>       Title of Invention

<130>       File Reference      Personal file reference     M, when filed prior to
                                                            assignment of appl.
                                                            number

<140>       Current Application Specify as:                 M, if available
            Number              US 07/999,999 or
                                PCT/US96/99999

<141>       Current Filing Date Specify as: yyyy-mm-dd      M, if available

<150>       Prior Application   Specify as:                 M, if applicable
            Number              US 07/999,999 or            include priority
                                PCT/US96/99999              documents under
                                                            35 USC 119 and 120

<151>       Prior Application   Specify as: yyyy-mm-dd      M, if applicable
            Filing Date

<160>       Number of SEQ ID    Count includes total number
            NOs                 of SEQ ID NOs

<170>       Software            Name of software used to
                                create the Sequence Listing

<210>       SEQ ID NO: #:       Response shall be an
                                integer representing the
                                SEQ ID NO shown

<211>       Length              Respond with an integer     M
                                expressing the number of
                                bases or amino acid
                                residues

<212>       Type                Whether presented sequence
                                molecule is DNA, RNA, or
                                PRT (protein). If a
                                nucleotide sequence
                                contains both DNA and RNA
                                fragments, the type shall
                                be "DNA." In addition, the
                                combined DNA/RNA molecule
                                shall be further described
                                in the <220> to <223>
                                feature section.

<213>       Organism            Scientific name, i.e.
                                Genus/species, Unknown or
                                Artificial Sequence. In
                                addition, the "Unknown" or
                                "Artificial Sequence"
                                organisms shall be further
                                described in the <220> to
                                <223> feature section.

<220>       Feature             Leave blank after <220>.    M, under the following
                                <221-223> provide for a     conditions: if "n,"
                                description of points of    "Xaa," or a modified or
                                biological significance in  unusual L-amino acid or
                                the sequence.               modified base was used
                                                            in a sequence; if
                                                            ORGANISM is "Artificial
                                                            Sequence" or "Unknown";
                                                            if molecule is combined
                                                            DNA/RNA.

<221>       Name/Key            Provide appropriate         M, under the following
                                identifier for feature,     conditions: if "n,"
                                preferably from WIPO        "Xaa," or a modified or
                                Standard ST.25 (1998),      unusual L-amino acid or
                                Appendix 2, Tables 5 and 6  modified base was used
                                                            in a sequence

<222>       Location            Specify location within     M, under the following
                                sequence; where             conditions: if "n,"
                                appropriate state number    "Xaa," or a modified or
                                of first and last           unusual L-amino acid or
                                bases/amino acids in        modified base was used
                                feature                     in a sequence

<223>       Other Information   Other relevant              M, under the following
                                information; four lines     conditions: if "n,"
                                maximum                     "Xaa," or a modified or
                                                            unusual L-amino acid or
                                                            modified base was used
                                                            in a sequence; if
                                                            ORGANISM is "Artificial
                                                            Sequence" or "Unknown";
                                                            if molecule is combined
                                                            DNA/RNA.

<300>       Publication         Leave blank after <300>
            Information

<301>       Authors             Preferably max. of ten      O
                                named authors of
                                publication; specify one
                                name per line, preferable
                                format: Surname, Other
                                Names and/or Initials

<302>       Title

<303>       Journal

<304>       Volume

<305>       Issue

<306>       Pages

<307>       Date                Journal date on which data  O
                                published; specify as
                                yyyy-mm-dd, MMM-yyyy or
                                Season-yyyy

<308>       Database Accession  Accession number assigned   O
            Number              by database including
                                database name

<309>       Database Entry Date Date of entry in database,
                                specify as yyyy-mm-dd or
                                MMM-yyyy

<310>       Patent Document     Document number; for
            Number              patent-type citations
                                only. Specify as, for
                                example, US 07/999,999

<311>       Patent Filing Date  Document filing date, for
                                patent-type citations
                                only; specify as
                                yyyy-mm-dd

<312>       Publication Date    Document publication date,
                                for patent-type citations
                                only; specify as
                                yyyy-mm-dd

<313>       Relevant Residues   FROM (position) TO
                                (position)

<400>       Sequence            SEQ ID NO should follow the
                                numeric identifier and
                                should appear on the line
                                preceding the actual
                                sequence
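
The layout rules stated before the table (each item on a new line, introduced by a numeric identifier in angle brackets, with <110> through <170> only at the beginning) lend themselves to a simple mechanical check. The following is a minimal sketch under those stated rules only; the function names parse_items and header_tags_out_of_place are hypothetical, and treating the first <210> as the end of the header block is an assumption of this sketch, not wording from the rule.

```python
import re

# An item starts with a three-digit numeric identifier in angle brackets.
TAG_LINE = re.compile(r"^<(\d{3})>\s*(.*)$")
HEADER_TAGS = {"110", "120", "130", "140", "141", "150", "151", "160", "170"}


def parse_items(text):
    """Return (identifier, value) pairs; untagged lines continue the prior item."""
    items = []
    for line in text.splitlines():
        stripped = line.strip()
        match = TAG_LINE.match(stripped)
        if match:
            items.append([match.group(1), match.group(2).strip()])
        elif items and stripped:
            items[-1][1] = (items[-1][1] + " " + stripped).strip()
    return [tuple(item) for item in items]


def header_tags_out_of_place(items):
    """List header identifiers (<110>-<170>) found after the first <210>."""
    seen_sequence = False
    misplaced = []
    for tag, _value in items:
        if tag == "210":
            seen_sequence = True
        elif seen_sequence and tag in HEADER_TAGS:
            misplaced.append(tag)
    return misplaced


sample = """<110> Smith, John
<120> Example of a Sequence Listing
<210> 3
<211> 11
<212> PRT
<213> Artificial Sequence
<400> 3
Met Val Asn Leu Glu Pro Met His Thr Glu Ile
"""
items = parse_items(sample)
print(items[0], header_tags_out_of_place(items))
```

A listing that repeated <110> after a <210> entry would be reported by header_tags_out_of_place, whereas the sample above yields an empty list.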



Section 1.824 is revised to read as follows: 



1.824 Form and format for nucleotide and/or amino acid sequence 
submissions in computer readable form. 



(a) The computer readable form required by 1.821(c) shall meet the 
following specifications: 

(1) The computer readable form shall contain a single "Sequence Listing" 
as either a diskette, series of diskettes, or other permissible media 
outlined in paragraph (c) of this section. 

(2) The "Sequence Listing" in paragraph (a)(1) of this section shall be 
submitted in American Standard Code for Information Interchange (ASCII) 
text. No other formats shall be allowed. 

(3) The computer readable form may be created by any means, such as word 
processors, nucleotide/amino acid sequence editors or other custom 
computer programs; however, it shall conform to all specifications 
detailed in this section. 

(4) File compression is acceptable when using diskette media, so long as 
the compressed file is in a self-extracting format that will decompress 
on one of the systems described in paragraph (b) of this section. 

(5) Page numbering shall not appear within the computer readable form 
version of the "Sequence Listing" file. 

(6) All computer readable forms shall have a label permanently affixed 
thereto on which has been hand-printed or typed: the name of the 
applicant, the title of the invention, the date on which the data were 
recorded on the computer readable form, the operating system used, a 
reference number, and an application serial number and filing date, if 
known. 
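
The ASCII-only requirement of paragraph (a)(2) can be tested before a file is submitted. The sketch below is one possible check, not a procedure prescribed by the rule: it treats any byte above 0x7F as non-ASCII and reports where it occurs. The function name non_ascii_positions is hypothetical.

```python
def non_ascii_positions(data: bytes):
    """Return (line, column, byte value) for every byte outside 7-bit ASCII."""
    problems = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                problems.append((line_no, col, byte))
    return problems


# A clean fragment and one containing an accented character (Latin-1 encoded).
clean = b"<210> 1\n<212> PRT\n"
dirty = "<213> Caf\u00e9".encode("latin-1")
print(non_ascii_positions(clean))
print(non_ascii_positions(dirty))
```

To check a file on disk, its contents can be read in binary mode and passed to the same function.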



(b) Computer readable form submissions must meet these format 
requirements: 

(1) Computer: IBM PC/XT/AT, or compatibles, or Apple Macintosh; 

(2) Operating System: MS-DOS, Unix or Macintosh; 



